Abstract: Content-Based Image Retrieval (CBIR) is an automatic process for searching relevant images based on user input.
The input could be parameters, sketches or example images. A typical CBIR process first extracts the image features and
stores them efficiently. It then compares the query with images from the database and returns the results. Besides reading
information that already exists in digital photographs and transforming this information to MPEG-7, RF supports
the creation of new metadata. Semantic information about the image is represented as a directed graph, where the nodes
reflect semantic objects, locations, agents, states, times or concepts and the edges define the relations between these
semantic entities. To enhance retrieval efficiency, content-based metadata is extracted and new instances of the image for
faster visualization, such as thumbnails, are created. The MPEG-7 description consists of the following parts: metadata
description, creation information, media information, textual annotation, semantics, and visual descriptors.

Index Terms: Text classification; Feature selection; K-Nearest Neighbor; Naive Bayesian

I. INTRODUCTION 

Content-based image retrieval (CBIR) systems analyze descriptions of visual content to organize and find images in
databases. The retrieval process usually relies on presenting a visual query (natural or synthetic) to the system, and
extracting from a database the set of images that best fit the user request. Such a mechanism, referred to as
query-by-example, requires the definition of an image representation (a set of descriptive features) and of some similarity
metrics to compare query and target images.
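
As an illustration of query-by-example, the following minimal Python sketch ranks database images by their distance to the
query. It assumes that every image has already been reduced to a feature vector of equal length; the L1 distance and the
names `l1_distance` and `query_by_example` are illustrative choices, not part of the system described in this paper.

```python
def l1_distance(a, b):
    """Sum of absolute differences between two feature vectors of equal length."""
    return sum(abs(x - y) for x, y in zip(a, b))


def query_by_example(query, database, k=3, distance=l1_distance):
    """Return the names of the k database images closest to the query.

    database maps an image name to its feature vector (hypothetical layout).
    """
    ranked = sorted(database.items(), key=lambda item: distance(query, item[1]))
    return [name for name, _ in ranked[:k]]


# Toy database with two-dimensional features.
database = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [5.0, 5.0]}
top_two = query_by_example([1.0, 1.0], database, k=2)
```

Any other similarity metric can be passed through the `distance` parameter without changing the ranking code.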

Several years of research in this field have highlighted a number of problems related to this (apparently simple) process.

Consequently, several additional mechanisms have been introduced to achieve better performance. Among
them, relevance feedback (RF) has proved to be a powerful tool to iteratively collect information from the user and transform
it into a semantic bias in the retrieval process. RF increases the retrieval performance because it enables the
system to learn what is relevant or irrelevant to the user across successive retrieval-feedback cycles.

II. CONTENT BASED IMAGE RETRIEVAL 

Content-Based Image Retrieval (CBIR) is an automatic process for searching relevant images based on user input. The
input could be parameters, sketches or example images. A typical CBIR process first extracts the image features and stores
them efficiently. It then compares the query with images from the database and returns the results. Besides reading
information that already exists in digital photographs and transforming this information to MPEG-7, RF supports the
creation of new metadata. Semantic information about the image is represented as a directed graph, where the nodes reflect
semantic objects, locations, agents, states, times or concepts and the edges define the relations between these semantic
entities.

The experimental metadata-based image retrieval system supports different types of retrieval mechanisms:

• Content based image retrieval using the MPEG-7 descriptors Color Layout, Edge Histogram and Scalable Color. 

• Graph based retrieval supporting wildcards for semantic relations and semantic objects. 

• 2D data repository and result set visualization based on the FastMap and FDP algorithms.

Algorithms for feature extraction and similarity measurement depend strongly on the features used. Each feature
can have more than one representation. Among these representations, the histogram is the most commonly used technique
to describe features.
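
A histogram representation can be sketched as follows. This is a minimal Python example under stated assumptions: pixels
are RGB triples with values from 0 to 255, the image is not empty, the quantization into 4 bins per channel is arbitrary,
and histogram intersection is used as one possible similarity measure. None of these choices is prescribed by the paper.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram with bins_per_channel**3 bins.

    bins_per_channel is assumed to divide 256 evenly.
    """
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        index = ((r // step) * bins_per_channel + (g // step)) * bins_per_channel + (b // step)
        hist[index] += 1
        count += 1
    return [value / count for value in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for normalized histograms; 1 means identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


red = color_histogram([(255, 0, 0)] * 4)
blue = color_histogram([(0, 0, 255)] * 4)
```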

This section introduces the techniques of CBIR, including a review of the features used, feature representation and
similarity measures.

1. Scalable Color Extraction Process 

2. Color Layout Extraction Process 

3. Edge Histogram Extraction Process 

4. Scalable + Color Extraction Process

5. Scalable + Color + Edge Extraction Process

2.1 RELEVANCE FEEDBACK 

Here the user can annotate the photos with free text and rate their quality. Pre-existing metadata such as EXIF or
IPTC tags inside images is loaded and converted to MPEG-7. The second panel, the semantic description panel,
offers a tool for the visual creation of MPEG-7 based semantic descriptions using a drawn directed graph. The
feature extraction panel shows the low-level descriptors that are extracted automatically. The extracted MPEG-7
descriptors are Color Layout, Scalable Color and Edge Histogram.
The Characteristics of the Relevance Feedback

Since the general assumption is that every user's need is different and time varying, the database cannot adopt a
fixed clustering structure, and the total number of classes and the class membership are not available beforehand since
these are assumed to be user-dependent and time varying as well. Of course, these rather extreme assumptions can be relaxed
in a real-world application to the degree of choice.

A typical scenario for relevance feedback in content-based image retrieval is as follows: 

Step 1. Machine provides initial retrieval results, through query-by-keyword, sketch, or example, etc. 

Step 2. User provides judgment on the currently displayed images as to whether, and to what degree, they are relevant or
irrelevant to her/his request.

Step 3. Machine learns and tries again. Go to step 2.

If each image/region is represented by a point in a feature space, relevance feedback with only positive (i.e., relevant)
examples can be cast as a density estimation or novelty detection problem (Scholkopf), while with both positive and
negative training examples it becomes a classification problem, or an on-line learning problem in a batch mode, but with
the following characteristics associated with this specific application scenario:

As MPEG-7 is a complex XML-based standard, it would not be a good idea to confront the user with an XML editor
and an instruction manual as tools for expressing the semantics of a photo. As a result, "Image RF" was designed to
support the user in the time-consuming task of annotating photos.




Fig 1: Simplified RF UML diagram

The central part of RF is the so-called "semantic description panel". It allows the user to define semantic objects like
agents, places, events and times, which are saved on exit so that they can be reused the next time RF is started. These
semantic objects can also be imported from an existing MPEG-7 file, which allows users to exchange objects and to create
and edit those objects in a tool of their choice.







Fig 2. Loading an image for RF 

Further, a whole series can be pre-annotated to simplify and speed up the task of annotating multiple images.
All images within the same context are placed in one file system folder and the user opens the first one using RF.
2.2 EDGE HISTOGRAM IMPLEMENTATION

The MPEG-7 Visual Standard specifies a set of descriptors that can be used to measure similarity in images or video.
Among them, the Edge Histogram Descriptor describes edge distribution with a histogram based on local edge distribution in
an image. Since the Edge Histogram Descriptor recommended for the MPEG-7 standard represents only the local edge
distribution in the image, the matching performance for image retrieval may not be satisfactory. This paper proposes the use
of global and semi-local edge histograms generated directly from the local histogram bins to increase the matching
performance.

The image array is divided into 4x4 sub-images. Each sub-image is further partitioned into non-overlapping
square image blocks whose size depends on the resolution of the input image. Each image block is categorized
into one of the following six types: vertical edge, horizontal edge, 45-degree diagonal edge, 135-degree diagonal edge,
non-directional edge and no edge. A 5-bin edge histogram of each sub-image can then be obtained, with one bin per edge
type. Each bin value is normalized by the total number of image blocks in the sub-image. The normalized bin values are
then nonlinearly quantized.




Fig 3: Five different edge types (vertical, horizontal, 45 degree, 135 degree and non-directional)

If none of the edge strengths computed for an image block exceeds a given threshold, the block is considered to contain no
edge. In this case, that particular image block does not contribute to any of the 5 edge bins. Consequently, each image
block is classified into one of the 5 types of edge blocks or as a non-edge block. Although the non-edge blocks do not
contribute to any histogram bins, each histogram bin value is normalized by the total number of image blocks including the
non-edge blocks. This implies that the summation of all histogram bin values for each sub-image is less than or equal to 1.
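
The block classification and normalization described above can be sketched in Python as follows. This is a simplified
illustration rather than the implementation used in this work. It assumes that each image block has already been reduced
to the mean luminance of its four quadrants (top-left, top-right, bottom-left, bottom-right), it uses the 2x2 filter
coefficients commonly given for the MPEG-7 Edge Histogram Descriptor, and the threshold value of 11 is an assumed default.
The nonlinear quantization step is omitted.

```python
import math

# Filter coefficients applied to the quadrant means
# (top-left, top-right, bottom-left, bottom-right) of an image block.
EDGE_FILTERS = {
    "vertical": (1, -1, 1, -1),
    "horizontal": (1, 1, -1, -1),
    "diag45": (math.sqrt(2), 0, 0, -math.sqrt(2)),
    "diag135": (0, math.sqrt(2), -math.sqrt(2), 0),
    "nondirectional": (2, -2, -2, 2),
}
EDGE_TYPES = list(EDGE_FILTERS)


def classify_block(means, threshold=11.0):
    """Return the dominant edge type of a block, or None for a non-edge block."""
    strengths = {
        name: abs(sum(c * m for c, m in zip(coeffs, means)))
        for name, coeffs in EDGE_FILTERS.items()
    }
    best = max(strengths, key=strengths.get)
    return best if strengths[best] >= threshold else None


def sub_image_histogram(blocks, threshold=11.0):
    """5-bin edge histogram of one sub-image, normalized by ALL blocks."""
    counts = dict.fromkeys(EDGE_TYPES, 0)
    for means in blocks:
        label = classify_block(means, threshold)
        if label is not None:  # non-edge blocks add to no bin
            counts[label] += 1
    total = len(blocks)        # includes the non-edge blocks
    return [counts[name] / total for name in EDGE_TYPES]


# Four toy blocks: one vertical edge, two flat blocks, one horizontal edge.
blocks = [(100, 0, 100, 0), (50, 50, 50, 50), (50, 50, 50, 50), (100, 100, 0, 0)]
hist = sub_image_histogram(blocks)
```

Because the two flat blocks are counted in the denominator but in no bin, the bin values of this toy sub-image sum to 0.5,
which is consistent with the bound stated above.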

2.3 COLOR LAYOUT IMPLEMENTATION 

The MPEG-7 Visual Standard specifies a set of descriptors that can be used to retrieve similar images from a digital photo
repository. Among them, the Color Layout Descriptor (CLD) represents the spatial distribution of colors in an image. The
Edge Histogram Descriptor (EHD) describes edge distribution with a histogram based on local edge distribution in an image.
These two features are very powerful for CBIR systems, especially for sketch-based image retrieval. Further, combining
color and texture features in CBIR systems leads to more accurate results for image retrieval. In both the Color Layout
Descriptor (CLD) and the Edge Histogram Descriptor (EHD), image features like color and edge distribution can be
localized in separate 4x4 sub-images.

The image array is partitioned into 8x8 blocks. Representative colors are selected and expressed in the YCbCr color
space. Each of the three components (Y, Cb and Cr) is transformed by an 8x8 DCT (Discrete Cosine Transform). The resulting
sets of DCT coefficients are zigzag-scanned and the first few coefficients are nonlinearly quantized to form the descriptor.
Fig 4: The CLD extraction process
Step 1: The image was loaded using OpenCV and the width and height of the image were obtained, from which the block width
and block height of the CLD were calculated by dividing by 8. The division was done using truncation, so that if the image
dimensions were not divisible by 8, the outermost pixels were not considered in the descriptor.

Step 2: Using the obtained information, the image data was parsed into three 4D arrays, one for each color component, where
a block can be accessed as a whole and pixels within each block can also be accessed by providing the index of the block
and the index of the pixel inside the block.

Step 3: A representative color was chosen for each block by averaging the values of all the pixels in each block. This
results in three 8x8 arrays, one for each color component. This step is directly visualized in the first window of figure 2.

Step 4: Each 8x8 matrix was transformed to the YCbCr color space.

Step 5: These matrices were then transformed by an 8x8 DCT (Discrete Cosine Transform) to obtain three 8x8 matrices of DCT
coefficients, one for each YCbCr component.

Step 6: The CLD descriptor was formed by reading in zigzag order six coefficients from the Y-DCT matrix and three
coefficients from each DCT matrix of the two chrominance components. The descriptor is saved as an array of 12 values.
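
The six steps above can be sketched in plain Python as follows. This is an illustrative reconstruction, not the code used
in this work: it takes the image as a nested list of RGB triples instead of loading it with OpenCV, it assumes the image is
at least 8x8 pixels, it uses the full-range JFIF conversion to YCbCr as an assumption, and it omits the nonlinear
quantization of the coefficients. All function names are hypothetical.

```python
import math


def block_averages(img, n=8):
    """Steps 1-3: split the image into n x n blocks and average each block.

    img is a list of rows, each a list of (r, g, b) triples. Truncating
    division means leftover border pixels are ignored.
    """
    bh, bw = len(img) // n, len(img[0]) // n
    grid = []
    for by in range(n):
        row = []
        for bx in range(n):
            sums = [0.0, 0.0, 0.0]
            for y in range(by * bh, (by + 1) * bh):
                for x in range(bx * bw, (bx + 1) * bw):
                    for c in range(3):
                        sums[c] += img[y][x][c]
            row.append(tuple(s / (bh * bw) for s in sums))
        grid.append(row)
    return grid


def rgb_to_ycbcr(rgb):
    """Step 4: full-range JFIF conversion (an assumed choice)."""
    r, g, b = rgb
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr


def dct2(m):
    """Step 5: orthonormal 2-D DCT-II of a square matrix."""
    n = len(m)
    scale = [math.sqrt(1 / n)] + [math.sqrt(2 / n)] * (n - 1)
    cos = [[math.cos((2 * i + 1) * u * math.pi / (2 * n)) for i in range(n)]
           for u in range(n)]
    return [[scale[u] * scale[v] * sum(m[i][j] * cos[u][i] * cos[v][j]
                                       for i in range(n) for j in range(n))
             for v in range(n)] for u in range(n)]


def zigzag(m, count):
    """Step 6: first `count` coefficients in zigzag order."""
    n = len(m)
    order = sorted(((i, j) for i in range(n) for j in range(n)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [m[i][j] for i, j in order[:count]]


def color_layout_descriptor(img):
    """Return 12 unquantized values: 6 from Y, 3 from Cb, 3 from Cr."""
    ycbcr = [[rgb_to_ycbcr(px) for px in row] for row in block_averages(img)]
    descriptor = []
    for channel, count in ((0, 6), (1, 3), (2, 3)):
        plane = [[px[channel] for px in row] for row in ycbcr]
        descriptor += zigzag(dct2(plane), count)
    return descriptor


# A uniformly colored 19x17 image: only the DC coefficients are non-zero.
flat_image = [[(100, 150, 200)] * 19 for _ in range(17)]
descriptor = color_layout_descriptor(flat_image)
```

For a uniformly colored image every block has the same representative color, so all coefficients except the first one of
each component are zero up to rounding error.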

2.4 DOMINANT COLOR DESCRIPTOR 

This descriptor specifies a set of dominant colors in an image. It is well suited to representing color features where a
small number of colors is enough to characterize the color information. The extraction algorithm quantizes the pixel color
values into a set of dominant colors.

2.5 SCALABLE COLOR DESCRIPTOR 

This descriptor is a color histogram in the HSV color space, encoded by a Haar transform [MPEG, 2002]. The
extraction is done by quantizing the image into a 256-bin HSV color space histogram and then using the Haar transform to
reduce the number of bins. The output of the method is a vector with integer components, represented by a histogram with
64, 32 or 16 bins. The distance matching can be done either in the Haar coefficient domain or in the histogram domain. In
the case where only the coefficient signs are retained, the matching can be done efficiently in the Haar coefficient domain
by calculating the Hamming distance, that is, the number of bit positions at which the two descriptors differ, using an XOR
operation on the two descriptors to be compared.
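
The sign-based matching can be sketched in Python as follows. This is a simplified illustration under stated assumptions:
it applies a plain one-dimensional Haar decomposition (pairwise sums and differences) to a histogram whose length is a power
of two, which is not necessarily the exact transform layout of the MPEG-7 standard, and all function names are
hypothetical.

```python
def haar_transform(hist):
    """Full 1-D Haar decomposition, coefficients ordered coarse to fine.

    len(hist) is assumed to be a power of two.
    """
    current = list(hist)
    details = []
    while len(current) > 1:
        sums = [current[i] + current[i + 1] for i in range(0, len(current), 2)]
        diffs = [current[i] - current[i + 1] for i in range(0, len(current), 2)]
        details = diffs + details  # coarser levels end up in front
        current = sums
    return current + details


def sign_bits(coefficients):
    """Pack the coefficient signs into an integer (1 for >= 0, 0 for < 0)."""
    bits = 0
    for c in coefficients:
        bits = (bits << 1) | (1 if c >= 0 else 0)
    return bits


def hamming_distance(bits_a, bits_b):
    """Number of differing bit positions, obtained with one XOR."""
    return bin(bits_a ^ bits_b).count("1")


# Two toy 4-bin histograms compared by coefficient signs only.
signs_a = sign_bits(haar_transform([4, 2, 1, 1]))
signs_b = sign_bits(haar_transform([1, 2, 4, 1]))
distance = hamming_distance(signs_a, signs_b)
```

Keeping only the first 16, 32 or 64 coefficients of the coarse-to-fine list would give a shorter descriptor in the same
spirit as the scalable representation described above.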

III. IMAGE RETRIEVAL 

3.1 QUERY REWEIGHTING

Some previous work investigates which visual features are important for the images (positive examples) picked by the user
at each feedback (also called iteration in this paper). The notion behind query reweighting (QR) is that, if the ith
feature fi occurs frequently in the positive examples, it should receive a higher weight. QR-like approaches convert image
feature vectors to weighted-term vectors, as in an early version of the Multimedia Analysis and Retrieval System. Another
interactive approach allows the user to submit a coarse initial query and to refine her/his need via a set of relevance
feedbacks.
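
One common way to realize query reweighting is to weight each feature by the inverse of its standard deviation over the
positive examples, so that features on which the relevant images agree count more. The Python sketch below shows this
heuristic; it is an assumption made for illustration, not necessarily the weighting used in the systems discussed here,
and the function names are hypothetical.

```python
import statistics


def reweight(positives, eps=1e-6):
    """Return normalized feature weights from the positive examples.

    Features with a small spread across the positive examples get a large
    weight; eps avoids division by zero for constant features.
    """
    dims = len(positives[0])
    raw = [1.0 / (statistics.pstdev([p[i] for p in positives]) + eps)
           for i in range(dims)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_distance(a, b, weights):
    """Weighted L1 distance used to re-rank the database."""
    return sum(w * abs(x - y) for w, x, y in zip(weights, a, b))


# Feature 0 is constant over the positive examples, feature 1 varies widely.
weights = reweight([[1.0, 0.0], [1.0, 10.0], [1.0, 5.0]])
```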




Fig 5. Relevance feedback with generalized QR technique (panels: Iteration 1, Iteration 2, Iteration 3)

3.2 QUERY POINT MOVEMENT 

Another solution for enhancing the accuracy of image retrieval is to move the query point toward the contour of the user's
preference in feature space. Query point movement (QPM) regards multiple positive examples as a new query point at each
feedback. After several forceful changes of location and contour, the query point should be close to a convex region of the
user's interest. Further, combining color and texture features in CBIR systems leads to more accurate results for image
retrieval.
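
Query point movement is often written in the Rocchio form, in which the new query is a weighted combination of the old
query, the centroid of the positive examples and the centroid of the negative examples. The Python sketch below follows
that form; the default weights are illustrative assumptions, and the names `mean_vector` and `move_query` are hypothetical.

```python
def mean_vector(vectors, dims):
    """Centroid of a list of vectors; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dims
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dims)]


def move_query(query, positives, negatives=(), alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update: alpha*query + beta*mean(pos) - gamma*mean(neg)."""
    dims = len(query)
    pos = mean_vector(positives, dims)
    neg = mean_vector(negatives, dims)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


# With alpha=0, beta=1, gamma=0 the new query is the centroid of the positives.
new_query = move_query([0.0, 0.0], [[2.0, 2.0], [4.0, 4.0]],
                       alpha=0.0, beta=1.0, gamma=0.0)
```

Setting `alpha=0`, `beta=1` and `gamma=0` reduces the update to treating the positive examples alone as the new query
point.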

3.3 QUERY EXPANSION 

Because QR and QPM cannot by themselves elevate the quality of RF, query expansion (QEX) has recently become another
popular technique in the solution space of RF. That is, straightforward search strategies, such as QR and QPM, cannot
completely cover the user's interest when it spreads across a broad feature space.
Retrieval offers four different ways to search for a matching photo:

1. Searching through an XPath statement. 

2. Defining search options through textboxes with various options. 

3. Content based image retrieval using the visual descriptors ColorLayout and ScalableColor defined in the MPEG-7 
standard. 

4. Searching for a similar semantic description graph. 

3.4 XPATH SEARCH 

The first option is mainly intended for developers and for debugging XPath statements, because all other retrieval
mechanisms use XPath as their query language. Searching for matching documents using XPath requires detailed knowledge of
the structure of the documents being searched. Basic statements like //*[contains(.,'textToSearchFor')] can be used to
query documents without knowing the structure, but these statements only offer minimal retrieval features.
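
The effect of the structure-independent statement can be illustrated in Python. The standard library module
xml.etree.ElementTree supports only a subset of XPath without functions such as contains(), so the sketch below emulates
the statement by testing the concatenated text of every element. The sample document and the function name
`elements_containing` are made up for the example.

```python
import xml.etree.ElementTree as ET


def elements_containing(xml_text, needle):
    """Emulate //*[contains(., needle)]: tags of all elements whose
    string value (all descendant text, concatenated) contains needle."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter() if needle in "".join(el.itertext())]


# Toy document; the element names are illustrative only.
sample = (
    "<Mpeg7><Description>"
    "<FreeTextAnnotation>sunset at the beach</FreeTextAnnotation>"
    "</Description><Other>x</Other></Mpeg7>"
)
hits = elements_containing(sample, "sunset")
```

The result contains the matching element and all of its ancestors, because the string value of an ancestor includes the
text of its descendants. This is one reason why such statements offer only coarse retrieval.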

3.5 SEMANTIC SEARCH 

The component of most interest is the panel offering a mechanism for searching semantic descriptions.

This component allows the user to define a graph with a minimum of one and a maximum of three nodes and two possible
relations. An asterisk is used as a wildcard. A search graph which contains only one node, with a word defining this node,
will return each MPEG-7 document wherein a semantic object containing the specified word is found.

IV. CONCLUSION AND FUTURE WORK 

The process concerns an image search engine that searches not only by the text attached to an image by an end user
through RF, but also by the visual content available in the image itself. The process reduces the number of required
iterations and improves overall retrieval performance, so that less time is consumed and the intended target can be
reached.

The data stored and retrieved is accurate and provides sufficient information, in the required format, whenever the
data is required. All modules include the necessary reports to help the user work with the project easily. MPEG-7
matches many of the current requirements for a metadata standard for usage in a personal digital 

Future work on this project is to show on CBIR systems that our approach is able to reach any given target image with
fewer iterations in the worst and average cases. The application can be enhanced according to the needs of the company.
This project is useful for image searching; in the future it is planned to connect it with semantic web-based image
retrieval and facial recognition. This project increased efficiency and reduced the time consumed.